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AMENDMENTS TO THE SPECIFICATION 



Please amend the paragraph at lines 4-6 on page 1, under the subheading 
"Related Applications", as follows: 

— This application in n r.mitiniimg application o f claims priority to commonly-owned 
and co-pending U.S. Provisional Application Nos. 6Q/14L2Q8. filed June 25. 1999, and 
60/188.659 . filed March 10, 2000, each of which is hereby incorporated by reference. - 



Please replace the paragraph beginning on page 6, line 14 and ending on page 
7, line 4, as follows; 

— The speed of the processors used in a cluster of the invention has a direct 
bearing on the overall speed of the cluster. Likewise, the number of processors used in 
a cluster is also a factor in determining the overall speed of the cluster. The third factor that 
has a bearing on cluster speed is known as "architectural overhead," The architecture of a 
cluster not only includes the physical topology of the system, but also the software 
technology employed in the connectivity, which dictates how efficiently data is being 
distributed between processors in a cluster. Connectivity is typically not one hxindred percent 
efficient, and it is the overhead associated with connectivity that slows the cluster. The prior 
art used in a complex router, plus a group of partial differential equations, to distribute 
intemodel intemodal data. This complexity of the prior art mefte sindicates that the addition of 
one processor provided less than one processor's worth of computer power. Further, in the 
prior art, the efficiency of adding additional processors decreased with the number of 
processors added. The invention described herein uses the logical topology of a switch^ 
interconnected Howard Cascade to ensure that fi-ee physical lines are always available. Using 
a linear load factor of approximately 1 .2 is a reasonable overhead estimate associated with 
the invention. This factor is quite efficient when compared to prior art allocation methods, 
and is the result of using a mathematical cascade in allocating the problem sets among nodes 
for processing and defining intemode timing. — 

Please replace the paragraph spanning page 1 5, line 37 through page 17, line 
as follows: 
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— Because the data distribution problem becomes more difficult when the nodes 
{processors or threads) are passing information amongst themselves, the foUoAving 
description provides for how this may be handled. There are four known methods of passing 
information: one to one, one to many, many to many, and many to one. The one to one 
method of information passing requires a single pair of nodes known as the source and 
destination nodes. In order to pass information effectively both the source and destination 
nodes need to be time synchronized. As can be seen in FlGs- 3A and 3B those nodes shown 
at time step 2 are time synchronized as well as those nodes shown at time step 3, Further, 
each of these nodes is guaranteed to have a clear channel. Thus as long as the pairs are time 
aligned via the Howard Cascade, point to point data transfer occurs without waiting and with 
clear a priori node pairing locations. The one to many information passing method works 
similarly to the one to one information transfer method. As long as the associated time 
synchronized nodes are used in the one to many information passing then we are guaranteed 
to have clear channels among all of the nodes. Because the invention preferably use TCP/IP 
for intomodel i ntemodal communication, for one to many infonnation transfer the IP 
broadcast and multicast options are used to send data in parallel to the proper destinations. 
Many to many information transfer performs like several one to many information transfers. 
Many to one infonnation transfers require different processing depending on why the many 
to one relationship exists. For example: representing a multiple precision value of a 
transcendental function requires that function to be represented as a series where components 
of the series are added/subtracted to produce a single multiple precession instance of the 
function. Since all components of the series are summed, this can be thought of as a many to 
one relationship. However, the summing can take place as several discrete one to one 
relationships, thus any many to one relationship which can be represented as a discrete one to 
one relationships will be. Almost all current mathematical functions which are used in 
parallel processing fall into this category. Those few that do not will not run efficiently on 
any parallel processing system. The above description me^ indicates that the invention is 
able to handle mathematical functions used in parallel processing more efficiently than the 
existing art. Those skilled in the art should appreciate that other clusters, cascades and 
supercomputing — 

Please amend the paragraph at lines 10-12 on page 20 as follows: 

The combination of the rapid allocation, automatic bandwidth matching, and 

efficient problem set distribution me^t ssignifies that machines of different types and speeds 
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can be used effectively within the cluster, providing a significant advantage of the invention. 



Please amend the paragraph at lines 18-25 on page 21 as follow^s: 

— FIG- 4C illustrates the relationship between cluster "strips'' and threads. The 
number of threads in a "strip" equals the associated cascade level less one. For example, the 
number of threads in Stripi is 3. This is equivalent to cascade level 2. The number of threads 
in Strip2 is 1, which is equivalent to cascade level 1. The number of threads in Strips is 0, 
which is equivalent to cascade level 0. To cover a full strip's worth of data, the first strip 
needs to have addresses for the other nodes in a cascade. Given a cascade of level X, a strip at 
Xn ffteage signifies that y = c (2 ^"'-1 ), where N is the cascade level equivalent for a strip. — 

Please replace the paragraph beginning on page 24, line 28 and ending on page 
25, line 9, as follows: 

— The invention provides many processing steps. These steps are explicitly 

set forth in th e provioional applioations U.S. Provisional Application Nos. 60/141 >208. 
filed June 2 5. 1999. and 60/188-659. filed March 10. 2000, and specifically within source 
code therein, from which this application claims priority. Nevertheless, FIGs. 5, 5A, SB, 5C, 
5D, 5E, SF, 5G illustrate certain flowcharts illustrating application of the invention from an 
end user computer (e.g., a computer at customer entity 22, FIG. 1), to the home node , and to 
the cluster (e.g., such as the Cluster 26, FIG. 1). Specifically, FIG* 5 illustrates process 
methodology occurring at the end user computer, FIG. SA illustrates process methodology 
occurring at the home note, and FIG. 5B illustrates process methodology occurring at the 
cluster. FIG* 5C, FIG- 5D, FIG. 5E, FIG. 5F, FIG. 5G illustrate library actions and 
interconnections (e.g,, connectors A-F) with FIG, 5, FIG. 5A, FIG. SB and other flowcharts 
of FIGs. 5C-5G. As used within these flowcharts: 'a Howard Strip" aaeaft eindicates the 
associate nodes/threads which receive a common list of TCP/IP addresses for communication 
purposes; "queue'* meafte denotes a list of system resources (threads, memory, sockets, etc.); 
"quiesce" m e an to e nsmr o signifies ensuring that all threads of an associated note have ceased 
processing problems; ''head and tail pointers" meaft denote the link list of the list pointer 
(head pointer) and the link list end of list pointer (tail pointer); and "parallelization" 
me^t sdenotes the conversion of a mathematical function or algorithm to its mathematical 
series equivalent, — 
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